Policy-Based Primal-Dual Methods for Convex Constrained Markov Decision Processes

Abstract

We study convex Constrained Markov Decision Processes (CMDPs) in which the objective is concave and the constraints are convex in the state-action occupancy measure. We propose a policy-based primal-dual algorithm that updates the primal variable via policy gradient ascent and updates the dual variable via projected sub-gradient descent. Despite the loss of additivity structure and the nonconvex nature of the problem, we establish the global convergence of the proposed algorithm by leveraging a hidden convexity in the problem, and prove an O(T^{-1/3}) convergence rate in terms of both optimality gap and constraint violation. When the objective is strongly concave in the occupancy measure, we prove an improved convergence rate of O(T^{-1/2}). By introducing a pessimistic term to the constraint, we further show that zero constraint violation can be achieved while preserving the same convergence rate for the optimality gap. This work is the first in the literature to establish non-asymptotic convergence guarantees for policy-based primal-dual methods for solving infinite-horizon discounted convex CMDPs.
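The primal-dual scheme summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a small tabular MDP, a softmax policy, an occupancy measure computed exactly by a linear solve, and constant step sizes. All names (`softmax_policy`, `occupancy`, `lagrangian_grad`, `primal_dual`), the toy objective `f`, the linear constraint `g`, and every numeric value are hypothetical choices made for illustration. The primal gradient uses the chain rule through the occupancy measure: the gradient of the Lagrangian with respect to the occupancy acts as a pseudo-reward in the standard softmax policy gradient formula.

```python
import numpy as np

def softmax_policy(theta):
    """Row-wise softmax: theta[s, a] -> pi(a | s)."""
    z = np.exp(theta - theta.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

def occupancy(pi, P, rho, gamma):
    """Unnormalized discounted occupancy lam[s, a] = sum_t gamma^t Pr(s_t=s, a_t=a)."""
    P_pi = np.einsum("sa,sat->st", pi, P)  # state-to-state kernel under pi
    d = np.linalg.solve(np.eye(len(rho)) - gamma * P_pi.T, rho)
    return d[:, None] * pi

def lagrangian_grad(theta, mu, P, rho, gamma, grad_f, grad_g):
    """Gradient in theta of L(theta, mu) = f(lam) - mu * g(lam), plus lam itself."""
    pi = softmax_policy(theta)
    lam = occupancy(pi, P, rho, gamma)
    r = grad_f(lam) - mu * grad_g(lam)  # pseudo-reward: dL/dlam at the current lam
    P_pi = np.einsum("sa,sat->st", pi, P)
    v = np.linalg.solve(np.eye(len(rho)) - gamma * P_pi, (pi * r).sum(axis=1))
    q = r + gamma * P @ v
    adv = q - (pi * q).sum(axis=1, keepdims=True)
    return lam * adv, lam  # softmax policy gradient: d(s) * pi(a|s) * A(s, a)

def primal_dual(P, rho, gamma, grad_f, g, grad_g,
                T=500, eta_theta=0.1, eta_mu=0.05, mu_max=10.0):
    """Gradient ascent on theta, projected sub-gradient descent on mu in [0, mu_max]."""
    theta = np.zeros(P.shape[:2])
    mu = 0.0
    for _ in range(T):
        grad, lam = lagrangian_grad(theta, mu, P, rho, gamma, grad_f, grad_g)
        theta = theta + eta_theta * grad                        # primal ascent step
        mu = float(np.clip(mu + eta_mu * g(lam), 0.0, mu_max))  # projected dual step
    return theta, mu

# Hypothetical toy instance: concave objective f, linear (hence convex) constraint g <= 0.
rng = np.random.default_rng(0)
S, A, gamma = 3, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a, s'] transition probabilities
rho = np.full(S, 1.0 / S)                   # initial state distribution
reward, cost, budget = rng.random((S, A)), rng.random((S, A)), 4.0

f = lambda lam: (reward * lam).sum() - 0.05 * (lam ** 2).sum()
grad_f = lambda lam: reward - 0.1 * lam
g = lambda lam: (cost * lam).sum() - budget
grad_g = lambda lam: cost

theta, mu = primal_dual(P, rho, gamma, grad_f, g, grad_g)
```

The step sizes and horizon here are arbitrary. The rates quoted in the abstract depend on the specific step-size choices analyzed in the paper, which this sketch does not reproduce.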

Similar resources

Primal-dual subgradient methods for convex problems

In this paper we present a new approach for constructing subgradient schemes for different types of nonsmooth problems with convex structure. Our methods are primal-dual since they are always able to generate a feasible approximation to the optimum of an appropriately formulated dual problem. Besides other advantages, this useful feature provides the methods with a reliable stopping criterion. T...

Primal–dual Methods for Nonlinear Constrained Optimization

. . . If a function of several variables should be a maximum or minimum and there are between these variables one or several equations, then it will suffice to add to the proposed function the functions that should be zero, each multiplied by an undetermined quantity, and then to look for the maximum and the minimum as if the variables were independent; the equation that one will find combin...

Constrained Markov Decision Processes

In many situations in the optimization of dynamic systems, a single utility for the optimizer might not suffice to describe the real objectives involved in the sequential decision making. A natural approach for handling such cases is that of optimization of one objective with constraints on other ones. This allows in particular to understand the tradeoff between t...

A Primal-Dual Algorithmic Framework for Constrained Convex Minimization

We present a primal-dual algorithmic framework to obtain approximate solutions to a prototypical constrained convex optimization problem, and rigorously characterize how common structural assumptions affect the numerical efficiency. Our main analysis technique provides a fresh perspective on Nesterov’s excessive gap technique in a structured fashion and unifies it with smoothing and primal-dual...

Primal-dual Algorithm for Convex Markov Random Fields

Computing maximum a posteriori configuration in a first-order Markov Random Field has become a routinely used approach in computer vision. It is equivalent to minimizing an energy function of discrete variables. In this paper we consider a subclass of minimization problems in which unary and pairwise terms of the energy function are convex. Such problems arise in many vision applications includ...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i9.26299